English-Latvian Toponym Processing: Translation Strategies and Linguistic Patterns

Authors

Tatiana Gornostay

Inguna Skadiņa

Abstract

The paper presents a study of a challenging task in machine translation and cross-language information retrieval: the translation of toponyms. Due to their linguistic and extra-linguistic nature, toponyms deserve special treatment. The overall translation process includes two stages of processing: dictionary-based translation and out-of-vocabulary toponym translation. The latter is divided into three steps: source string normalisation, translation, and target string normalisation. The translation process applies translation strategies and linguistic toponym translation patterns. Possible translation strategies (transliteration, translation per se, and combined strategies) and linguistic toponym translation patterns, including multi-word patterns, were investigated and implemented for English-Latvian machine translation. 10,000 UK-related toponyms from Geonames were selected for a development set. Evaluation of output quality on a test set showed 67% accuracy in out-of-vocabulary translation: 58% on a set of one-word toponymic units and 81% on a multi-word test set.
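
The two-stage process described in the abstract (dictionary lookup first, then a three-step out-of-vocabulary path) might be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the dictionary entry, the generic-term table, the transliteration rules, and all function names are hypothetical placeholders, and real Latvian case endings and word order are not handled.

```python
import re

# Hypothetical toy resources; none of these tables come from the paper.
TOPONYM_DICTIONARY = {"london": "Londona"}
GENERIC_TERMS = {"river": "upe", "lake": "ezers"}
# Placeholder rules replacing letters that the Latvian alphabet lacks.
TRANSLITERATION_RULES = [("th", "t"), ("w", "v"), ("x", "ks"), ("q", "k"), ("y", "i")]


def normalise_source(text):
    """Step 1: collapse whitespace and lower-case the source string."""
    return re.sub(r"\s+", " ", text.strip()).lower()


def transliterate(word):
    """Apply the toy character rules in order."""
    for source, target in TRANSLITERATION_RULES:
        word = word.replace(source, target)
    return word


def translate_oov(normalised):
    """Step 2: translate generic terms, transliterate everything else.

    Returns (token, is_generic) pairs so the next step can tell them apart.
    """
    parts = []
    for token in normalised.split(" "):
        if token in GENERIC_TERMS:
            parts.append((GENERIC_TERMS[token], True))
        else:
            parts.append((transliterate(token), False))
    return parts


def normalise_target(parts):
    """Step 3: capitalise proper-name tokens, keep generic terms lower-case."""
    return " ".join(tok if generic else tok.capitalize() for tok, generic in parts)


def translate_toponym(text):
    """Stage 1: dictionary lookup; stage 2: out-of-vocabulary fallback."""
    normalised = normalise_source(text)
    if normalised in TOPONYM_DICTIONARY:
        return TOPONYM_DICTIONARY[normalised]
    return normalise_target(translate_oov(normalised))
```

Here `translate_toponym("London")` is resolved by the dictionary stage, while an unseen multi-word name such as the made-up `"Lake Wexby"` falls through to the combined path, where the generic term is translated and the proper name is transliterated.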

Similar resources

Pattern-based English-Latvian Toponym Translation

Due to their linguistic and extra-linguistic nature, toponyms deserve special treatment when they are translated. The paper deals with issues related to automated translation of toponyms from English into Latvian. The translation process allows us to translate not only toponyms from a dictionary, but out-of-vocabulary toponyms as well. Translation of out-of-vocabulary toponyms is divided into thre...

Topicalization in English Translation of the Holy Quran: A Comparative Study

The Holy Quran, as an Arabic masterpiece, comprises great domains of syntactical, phonological, and semantic literary patterns. These patterns work as the shackle of translators. This study examined the application of the most common shift strategies in Catford's linguistic model for translation of topicalization in chapter 29 of the Holy Quran. The topicalized cases were compared to their coun...

Toward a Comparable Corpus of Latvian, Russian and English Tweets

Twitter has become a rich source for linguistic data. Here, a possibility of building a trilingual Latvian-Russian-English corpus of tweets from Riga, Latvia is investigated. Such a corpus, once constructed, might be of great use for multiple purposes including training machine translation models, examining cross-lingual phenomena and studying the population of Riga. This pilot study shows that...

The Representation of Non-Linguistic Sounds in Persian and English Subtitles for the Deaf and Hard-of-Hearing: A Comparative Study

Subtitling for the deaf and hard-of-hearing (SDH) is an area which deserves special attention, as it enables these people to access the part of the 'world' intended for hearing people, including the world of 'motion pictures', and particularly movie sounds. Compared to linguistic sounds, non-linguistic sounds have received little attention in the field of translation, although they are in...

Exploring the Translator's Solutions to the Translation of Conversational Implicatures from English into Persian: the Case of Tolkien's the Lord of the Rings

The present study aimed to examine the translator's solutions to the translation of conversational implicatures from English into Persian. To do so, 120 conversational implicatures were extracted from the novel the Lord of the Rings (Tolkien, 1954) and classified based on Grice's (1975) categorization of Maxims, including quality, quantity, relevance, and manner. Mur Duenas'...

Publication date: 2009
